Community Detection via Discriminant functions for Random Walks in the degree-corrected Stochastic Block Model

Author

  • Stephen Ragain
Abstract

Recent work theoretically connected the problem of seed set expansion with personalized PageRank by showing that, when a classifier for community detection is built from random walk landing probabilities of nodes in the stochastic block model, the optimal weights on those probabilities follow the weights in personalized PageRank [3]. One drawback of the stochastic block model is that it can be unrealistic as a model for empirical graphs. The degree-corrected stochastic block model is an extension that allows edge probabilities to depend both on the communities the two nodes belong to and on a latent “friendliness” parameter attached to each node. The goal of this project was to extend the analysis of [3] to the degree-corrected stochastic block model and to (i) determine whether the optimal weights of a classifier using random walk probabilities follow some known node-ranking model (possibly even personalized PageRank, as with the standard stochastic block model), (ii) find a closed form for the weights in terms of the model parameters, and (iii) explore the performance and practicality of such a classifier with experiments. With this project I was able to extend the analysis of [3] to the degree-corrected stochastic block model, deriving the optimal weights of a classifier in the space of normalized random walk probabilities in terms of the parameters of the model. My central result is a proof akin to the one in [3]. It shows that, with high probability, random walk probabilities in the degree-corrected stochastic block model concentrate around a distribution characterized by a smaller linear system on the blocks, with a conditional distribution within each block proportional to the “friendliness” of each node in that block.
This result is of independent theoretical interest in relating graph models and random walks (which in turn link to ranking models such as personalized PageRank). Combined with some simple parameter estimation, it also produced accurate classification on synthetic data as well as on two empirical networks.
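To make the objects in the abstract concrete, the following is a minimal sketch, not the author's implementation. It assumes the common parameterization in which the edge probability between nodes i and j is theta_i * theta_j * B[block_i, block_j], with theta playing the role of the “friendliness” parameters. The function names, the parameter values in the usage note, and the geometric weights are all hypothetical choices for illustration; the optimal weights derived in the project are not reproduced here.

```python
import numpy as np

def sample_dcsbm(block_of, theta, B, rng):
    """Sample an undirected graph from a degree-corrected SBM (assumed form).

    Edge i~j appears independently with probability
    theta[i] * theta[j] * B[block_of[i], block_of[j]], clipped to [0, 1].
    """
    n = len(block_of)
    P = np.outer(theta, theta) * B[np.ix_(block_of, block_of)]
    P = np.clip(P, 0.0, 1.0)
    # Draw only the strict upper triangle, then mirror it: no self-loops.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(float)

def landing_probabilities(A, seed, steps):
    """Return a (steps, n) array whose k-th row is the distribution of a
    simple random walk started at `seed` after k+1 steps."""
    deg = A.sum(axis=1)
    isolated = np.flatnonzero(deg == 0)
    W = A / np.where(deg == 0, 1.0, deg)[:, None]  # row-stochastic transitions
    W[isolated, isolated] = 1.0  # an isolated node keeps the walker in place
    p = np.zeros(A.shape[0])
    p[seed] = 1.0
    rows = []
    for _ in range(steps):
        p = p @ W
        rows.append(p.copy())
    return np.array(rows)

def discriminant_scores(landing, weights):
    """Linear discriminant: a weighted sum of landing probabilities per node."""
    return np.asarray(weights) @ landing
```

A possible usage, with made-up parameters: draw `theta = rng.uniform(0.5, 1.0, size=n)` from `rng = np.random.default_rng(0)`, set `B = np.array([[0.5, 0.05], [0.05, 0.5]])`, and score nodes with geometrically decaying weights such as `0.5 ** k`, which mimics the shape of personalized PageRank weights. Nodes with high scores are candidates for the seed's community.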

Similar resources

Consistency of community detection in networks under degree-corrected stochastic block models

Community detection is a fundamental problem in network analysis, with applications in many diverse areas. The stochastic block model is a common tool for model-based community detection, and asymptotic tools for checking consistency of community detection under the block model have been recently developed. However, the block model is limited by its assumption that all nodes within a community ...

Community Detection via Measure Space Embedding

We present a new algorithm for community detection. The algorithm uses random walks to embed the graph in a space of measures, after which a modification of k-means in that space is applied. The algorithm is therefore fast and easily parallelizable. We evaluate the algorithm on standard random graph benchmarks, including some overlapping community benchmarks, and find its performance to be bett...

Overlapping Communities Detection via Measure Space Embedding

We present a new algorithm for community detection. The algorithm uses random walks to embed the graph in a space of measures, after which a modification of k-means in that space is applied. The algorithm is therefore fast and easily parallelizable. We evaluate the algorithm on standard random graph benchmarks, including some overlapping community benchmarks, and find its performance to be bett...

Non-Backtracking Spectrum of Degree-Corrected Stochastic Block Models

Motivated by community detection, we characterise the spectrum of the non-backtracking matrix B in the Degree-Corrected Stochastic Block Model. Specifically, we consider a random graph on n vertices partitioned into two equal-sized clusters. The vertices have i.i.d. weights {φ_u}, u = 1, ..., n, with second moment Φ. The intra-cluster connection probability for vertices u and v is (φ_u φ_v / n) a and the inter-clust...

Community Detection Using Slow Mixing Markov Models

The task of community detection in a graph formalizes the intuitive task of grouping together subsets of vertices such that vertices within clusters are more tightly connected than those in disparate clusters. This paper approaches community detection in graphs by constructing Markov random walks on the graphs. The mixing properties of the random walk are then used to identify communities. We use co...

Publication date: 2017